skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Zhou, Naitian"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Narrative film is a composition of writing, cinematography, editing, and performance. While much computational work has focused on the writing or visual style in film, we conduct in this paper a com- putational exploration of acting performance. Applying speech emotion recognition models and a vari- ationist sociolinguistic analytical framework to a corpus of popular, contemporary American film, we find narrative structure, diachronic shifts, and genre- and dialogue-based constraints located in spoken performances. 
    more » « less
  2. In this work, we survey the way in which classification is used as a sensemaking practice in cultural analytics, and assess where large language models can fit into this landscape. We identify ten tasks supported by publicly available datasets on which we empirically assess the performance of LLMs compared to traditional supervised methods, and explore the ways in which LLMs can be employed for sensemaking goals beyond mere accuracy. We find that prompt-based LLMs are competitive with traditional supervised models for established tasks, but perform less well on de novo tasks. In addition, LLMs can assist sensemaking by acting as an intermediary input to formal theory testing. 
    more » « less
  3. Much work in the space of NLP has used computational methods to explore sociolinguistic variation in text. In this paper, we argue that memes, as multimodal forms of language comprised of visual templates and text, also exhibit meaningful social variation. We construct a computational pipeline to cluster individual instances of memes into templates and semantic variables, taking advantage of their multimodal structure in doing so. We apply this method to a large collection of meme images from Reddit and make available the resulting SEMANTICMEMES dataset of 3.8M images clustered by their semantic function. We use these clusters to analyze linguistic variation in memes, discovering not only that socially meaningful variation in meme usage exists between subreddits, but that patterns of meme innovation and acculturation within these communities align with previous findings on written language. 
    more » « less
  4. null (Ed.)
    Offering condolence is a natural reaction to hearing someone’s distress. Individuals frequently express distress in social media, where some communities can provide support. However, not all condolence is equal—trite responses offer little actual support despite their good intentions. Here, we develop computational tools to create a massive dataset of 11.4M expressions of distress and 2.8M corresponding offerings of condolence in order to examine the dynamics of condolence online. Our study reveals widespread disparity in what types of distress receive supportive condolence rather than just engagement. Building on studies from social psychology, we analyze the language of condolence and develop a new dataset for quantifying the empathy in a condolence using appraisal theory. Finally, we demonstrate that the features of condolence individuals find most helpful online differ substantially in their features from those seen in interpersonal settings. 
    more » « less